Abstract

Introduction: The female sex steroids estrogen and progesterone are important in breast cancer etiology. It 
therefore seems plausible that variation in genes involved in metabolism of these hormones may affect breast 
cancer risk, and that these associations may vary depending on menopausal status and use of hormone 
therapy. 

Methods: We conducted a nested case-control study of breast cancer in the California Teachers Study cohort. We 
analyzed 317 tagging single nucleotide polymorphisms (SNPs) in 24 hormone pathway genes in 2746 non-Hispanic 
white women: 1351 cases and 1395 controls. Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated 
by fitting conditional logistic regression models using all women or subgroups of women defined by menopausal 
status and hormone therapy use. P values were adjusted for multiple correlated tests (P_ACT).

Results: The strongest associations were observed for SNPs in SLCO1B1, a solute carrier organic anion transporter
gene, which transports estradiol-17β-glucuronide and estrone-3-sulfate from the blood into hepatocytes. Ten of 38
tagging SNPs of SLCO1B1 showed significant associations with postmenopausal breast cancer risk; 5 SNPs
(rs11045777, rs11045773, rs16923519, rs4149057, rs11045884) remained statistically significant after adjusting for
multiple testing within this gene (P_ACT = 0.019-0.046). In postmenopausal women who were using combined
estrogen-progestin therapy (EPT) at cohort enrollment, the OR of breast cancer was 2.31 (95% CI = 1.47-3.62) per
minor allele of rs4149013 in SLCO1B1 (P = 0.0003; within-gene P_ACT = 0.002; overall P_ACT = 0.023). SNPs in other
hormone pathway genes evaluated in this study were not associated with breast cancer risk in premenopausal or 
postmenopausal women. 

Conclusions: We found evidence that genetic variation in SLCO1B1 is associated with breast cancer risk in
postmenopausal women, particularly among those using EPT. 
Introduction

Reproductive and hormonal factors, including age at 
menarche, parity, number of full-term pregnancies, age at 
first full-term pregnancy, breastfeeding, age at meno- 
pause, body mass index (BMI), and physical activity, are 
associated with breast cancer risk [1,2]. Consistent with 
these observations, breast cancer risk is higher among 
women with higher circulating levels of endogenous 
estrogen [3-5] and among women using combined post- 
menopausal estrogen and progestin therapy (EPT) [6-11]. 

Sex steroid hormones, whether endogenous or exo- 
genous, are synthesized and metabolized by many differ- 
ent enzymes (reviewed in [12]). Therefore, genetic 
variation among genes regulating sex steroid hormone 
levels may increase or decrease breast cancer risk by 
influencing hormone metabolism. Polymorphisms in 
several hormone pathway genes, including CYP19A1 
and COMT, have been associated with endogenous hor- 
mone levels [13-16]; however, studies investigating the 
association of genetic variation in hormone metabolism 
genes and breast cancer risk have generated mixed 
results [14,17-29]. Recently, a large study from the 
Breast and Prostate Cancer Cohort Consortium (BPC3) 
comprehensively analyzed 37 steroid hormone metabo- 
lism pathway genes in relation to breast cancer risk and 
reported null associations [16,30], suggesting that incon- 
sistencies in the literature may be due to findings 
observed by chance in small studies. However, it is pos- 
sible that the inconsistencies may be explained at least 
partly by differences in the distribution of environmental 
factors that modify the effects of genetic polymorphisms. 

EPT use increases breast cancer risk to a much greater 
extent than estrogen-only therapy (ET) [6-11]. There- 
fore, it is important to examine EPT and ET use sepa- 
rately when investigating gene-hormone therapy (HT) 
interactions. To date, few studies have investigated
gene-HT interactions by hormone formulation using
a comprehensive single-nucleotide polymorphism
(SNP)-tagging approach. The California Teachers Study
is an effective resource to study these questions because 
detailed data on hormone use were collected at baseline, 
and approximately 41% and 28% of the postmenopausal 
participants reported current use of EPT and ET, 
respectively [31]. Thus, using data from a case-control 
study nested within the California Teachers Study, we 
systematically investigated whether any of 24 hormone 
metabolism pathway genes or their interactions with HT 
were associated with breast cancer risk. 

Materials and methods 

Participants 

The California Teachers Study has been previously 
described in detail [32]. Briefly, the California Teachers
Study is a prospective cohort of women who were current,
recent, or retired California public school professionals
in 1995. By returning a baseline questionnaire in
1995-1996, 133,479 women joined the cohort and pro- 
vided detailed information on menopausal status, HT 
use, and other lifestyle and medical factors. The baseline 
questionnaire is available on the California Teachers 
Study website [33]. Cancer diagnoses in the cohort are 
identified through annual linkage with the California 
Cancer Registry, which identifies at least 99% of cancers 
diagnosed in California [34]. The California Teachers 
Study has been approved by the institutional review 
boards at all participating institutions: the Cancer Pre- 
vention Institute of California (CPIC), the University of 
California at Irvine (UCI), the University of Southern 
California (USC), and the City of Hope in accordance 
with assurances filed with and approved by the US 
Department of Health and Human Services. 

The nested breast cancer case-control study was 
designed to obtain biospecimens from breast cancer 
cases and unaffected controls within the 113,590 mem- 
bers of the cohort who were less than 80 years old at 
baseline, had continued residence in California during 
the study period (1995 to time of blood draw), and, 
before 1998, had no prior history of invasive or in situ 
breast cancer. Cases were women who had a histologi- 
cally confirmed invasive primary carcinoma of the breast 
(International Classification of Disease for Oncology 
code C50 restricted to morphology codes under 8590)
and who were 80 years old or younger between 1 Janu- 
ary 1998 and 31 May 2007. One control participant per 
case was randomly selected from the cohort and fre- 
quency-matched to the case on age at baseline (within 
5-year age groups), self-reported race/ethnicity (white, 
African-American, Latina, Asian, and other), and three 
broad geographic regions (that is, California Teachers 
Study specimen collection centers). Cancer cases were 
identified through quarterly linkages of the cohort to 
the California Cancer Registry database. Control selec- 
tion was conducted on a quarterly basis without replace- 
ment. For each wave of control selection, a reference 
date was determined (that is, January, April, July, and 
October of each year). Nearly equal numbers of controls 
were selected in each wave. One control participant had 
her breast cancer diagnosed after her control selection 
and was included in the analysis as both a control and a 
case. 

Collection of biological specimens and DNA extraction 

Collection of biological specimens was conducted at 
three study centers (CPIC in the northern half of Cali- 
fornia and USC and UCI in the southern half). Women 
who declined blood draw were asked whether they were 
willing to provide a saliva sample, and, if so, an Oragene
DNA self-collection kit (DNA Genotek, Kanata, ON, 
Canada) was mailed to the participant with informed 
consent and return postage paid mailing materials. 
From the 8,118 eligible participants (2,618 cases and 
5,500 selected controls), we collected biological speci- 
mens for 74% of the cases (1,923 cases: 1,684 blood spe- 
cimens and 239 saliva specimens) and 61% of the 
controls (3,350 controls: 3,012 blood specimens and 338 
saliva specimens). All biologic samples were sent via 
overnight courier to the UCI laboratory for DNA extraction.
DNA was extracted from blood clots by using Qiagen
Clotspin Baskets and QIAamp DNA Blood
Maxi kits (Qiagen Inc., Valencia, CA, USA) in accordance
with Qiagen protocols. DNA was extracted from
saliva samples by using the Oragene protocol (DNA 
Genotek). The nested breast cancer case-control study 
of the California Teachers Study has been approved by 
the institutional review boards at all participating insti- 
tutions, and all participants provided written informed 
consent. 

Tagging SNP selection and genotyping 

We investigated 24 genes that are involved in female 
sex steroid hormone biosynthesis, metabolism, or 
excretion. Reviews on the function of these genes are 
available elsewhere [12,35,36]. For 21 of these genes, a 
tagging SNP approach was used (Supplementary Table
S1 in Additional file 1). For 16 of the 21 genes, we
selected linkage disequilibrium tagging SNPs across
each gene, including 20 kb upstream of the 5' untranslated
region (UTR) and 10 kb downstream of the 3' UTR. The tagging
SNPs were selected to capture all common SNPs
(minor allele frequency (MAF) of at least 5%) in individuals
of European ancestry with minimum pairwise R²
of at least 0.80 by using the Snagger software [37] and
the data from the International HapMap Project for
the white CEPH (Utah residents with ancestry from
northern and western Europe) population (HapMap
release 21, July 2006, genotype build 36 [38]). Tagging
SNPs for five genes included in the present study had
been selected by BPC3 by using the TagSNPs program
[30,39]. To facilitate the comparison across studies, we
used the BPC3-selected tagging SNPs for these five
genes (Supplementary Table S1 in Additional file 1).
The BPC3-selected tagging SNPs captured all common
SNPs (MAF of at least 5%) in whites with minimum
pairwise R² of at least 0.8. For the remaining 3 of the
24 genes, we genotyped a few selected SNPs due to
space limitations of our genotyping platform. For
CYP19A1, the selected SNPs were shown to be associated
with circulating estrogen concentrations in a
comprehensive analysis [13] (Supplementary Table S1
in Additional file 1).
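
The pairwise R² criterion used for tagging can be illustrated with a minimal sketch. It assumes two biallelic SNPs with known haplotype and allele frequencies; the function names and the example frequencies are hypothetical and are not taken from Snagger, TagSNPs, or the study data.

```python
def pairwise_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (R^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_tagged(p_ab: float, p_a: float, p_b: float, threshold: float = 0.80) -> bool:
    """True if the tag SNP captures the other SNP at the given R^2 threshold."""
    return pairwise_r2(p_ab, p_a, p_b) >= threshold


# Hypothetical frequencies, for illustration only.
r2_example = pairwise_r2(p_ab=0.28, p_a=0.30, p_b=0.30)
tagged_example = is_tagged(p_ab=0.28, p_a=0.30, p_b=0.30)
```

With these made-up frequencies, D = 0.19 and R² is about 0.82, so the pair would meet the 0.80 threshold described above.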



A total of 1,751 breast cancer cases and 1,697 controls 
were available for genotyping. We included a random 
sample of 193 replicates (105 cases and 88 controls) to 
monitor reproducibility and track plate flips or switches. 
The DNA samples were genotyped for the selected tagging
SNPs by using the Illumina GoldenGate Assay
(Illumina, Inc., San Diego, CA, USA) in the USC Core 
Facility. About 10% of the genotyped samples, including 
189 cases and 150 controls, had a genotyping success 
rate (call rate) of less than 90% and were excluded from 
the analyses. The genotyping concordance rate based on 
the 160 duplicate samples with a call rate of at least 
90% was 99.9%. Call rates were lower when the DNA 
was obtained from saliva samples: 23% of saliva samples 
and 8% of blood samples had a call rate of less than 
90% and these samples therefore were excluded from 
the analyses. However, the genotyping concordance after 
excluding the low call rate samples was excellent for sal- 
iva samples (greater than 99.9%). In addition, results 
from sensitivity analyses excluding saliva samples were 
similar to those using all samples. For the present study, 
we also excluded 88 women (52 cases and 36 controls)
who self-reported a previous history of cancer,
leaving 1,510 cases and 1,511 controls. Because the
majority (approximately 91%) of participants were non- 
Hispanic whites, we restricted the analyses to 2,746 
non-Hispanic white women (1,351 cases and 1,395 
controls). 
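
The sample exclusions reported above can be tallied in a few lines. This is only an arithmetic check of the counts given in the text; the variable names are ours, and the 10% figure is computed here without the 193 replicate samples in the denominator, which is an assumption.

```python
# Counts taken from the text above.
cases_available, controls_available = 1751, 1697
low_call_cases, low_call_controls = 189, 150        # sample call rate < 90%
prior_cancer_cases, prior_cancer_controls = 52, 36  # self-reported prior cancer

# Fraction of samples dropped for low call rate (replicates not counted: assumption).
low_call_fraction = (low_call_cases + low_call_controls) / (
    cases_available + controls_available
)

cases_remaining = cases_available - low_call_cases - prior_cancer_cases
controls_remaining = controls_available - low_call_controls - prior_cancer_controls

# Share of the remaining women who were non-Hispanic white (2,746 analyzed).
nhw_fraction = 2746 / (cases_remaining + controls_remaining)

print(cases_remaining, controls_remaining)
print(f"{low_call_fraction:.1%} low call rate, {nhw_fraction:.1%} non-Hispanic white")
```

The tallies agree with the 1,510 cases and 1,511 controls stated in the text, and with the rounded figures of about 10% and about 91%.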

Of the 355 SNPs genotyped, 332 had an SNP call rate 
of at least 90%. We excluded an additional three SNPs 
that had discordant readings in more than two duplicate 
pairs, eight SNPs with an MAF of less than 1% among 
non-Hispanic white controls, and four SNPs in COMT,
CYP11A, SULT1A1/SULT1A2, and UGT1A8 that were not in Hardy-
Weinberg equilibrium (P < 0.001); thus, we analyzed
317 SNPs in the present study. 
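
The SNP-level filters can be sketched as follows. The text does not state which Hardy-Weinberg test was used, so the 1-degree-of-freedom chi-square goodness-of-fit test below is an assumption, and the genotype counts are hypothetical.

```python
import math


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: the test is not informative
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(n_aa: int, n_ab: int, n_bb: int) -> bool:
    """Apply the MAF (>= 1%) and HWE (P >= 0.001) thresholds from the text."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= 0.01
            and hwe_chi2_p(n_aa, n_ab, n_bb) >= 0.001)


# SNP count after each exclusion reported in the text.
snps_analyzed = 332 - 3 - 8 - 4
```

For example, hypothetical counts of 490, 420, and 90 match Hardy-Weinberg expectations exactly and pass both filters, whereas 600, 200, and 200 show a strong heterozygote deficit and would be excluded.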

For 19 of the 20 genes for which we used the tagging 
SNP approach, the genotyped tagging SNPs efficiently 
captured 70% to 100% of all common SNPs (MAF of 
greater than 5%) in the HapMap dataset of European 
ancestry (HapMap release 24, genotype build 36) with 
pairwise R² of at least 0.80 (Supplementary Table S1 in
Additional file 1). We did not have sufficient tagging 
coverage for CYP21A2. 

Imputation 

We imputed SNPs in gene regions where we found a 
statistically significant association (P < 0.01 before mul- 
tiple testing correction) with breast cancer risk among 
all women or among subgroups defined by menopausal 
status. To do this, we used publicly available HapMap 
genotype data in the CEPH population as the reference 
sample (HapMap release 24, genotype build 36 [38]) and 
MACH 1.0 [40]. We excluded imputed SNPs when the 
MAF was less than 1% or when the R² was less than
0.30 [40]. 

Statistical analyses 

We used conditional logistic regression models with 
strata defined by 5-year age group and the three speci- 
men collection centers to estimate the odds ratios 
(ORs), 95% confidence intervals (CIs), and P values asso- 
ciated with each SNP by using log-additive models. The 
results did not change after further adjustment for 
potential confounders including menopausal status (pre- 
menopausal, postmenopausal, and unknown), HT use 
status at baseline (never used HT, currently using ET, 
currently using EPT, used HT in the past, and 
unknown), BMI (less than 25, 25 to less than 30, at least 
30 kg/m², and unknown), parity (0, 1 to 2, at least 3,
and unknown), and oral contraceptive use (never, ever, 
and unknown). Therefore, we present the results from
the conditional logistic regression models without adjustment
for these potential confounders.

We performed subgroup-specific analyses by meno- 
pausal status and, among postmenopausal women, by 
HT use at baseline, defining the groups of interest as 
never used HT, currently using ET, and currently using 
EPT. We calculated P for interaction by using a likelihood ratio
test comparing models with and without the product
term of genotype (0, 1, or 2 copies of the minor allele, as
an ordinal variable) and menopausal status or HT use. The
interaction tests for HT use were done separately for 
current EPT use (as compared with never HT use) and 
for current ET use (as compared with never HT use). 
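
The likelihood ratio test for interaction can be sketched as below. It assumes that the two nested conditional logistic models have already been fitted and that they differ by a single product term, so the test has 1 degree of freedom. The function name and the log-likelihood values are hypothetical.

```python
import math


def lrt_p_value_1df(loglik_without: float, loglik_with: float) -> float:
    """P value of a 1-df likelihood ratio test for one added product term.

    loglik_without: maximized log-likelihood of the model without the
                    genotype x exposure product term
    loglik_with:    maximized log-likelihood of the model that includes it
    """
    statistic = max(2.0 * (loglik_with - loglik_without), 0.0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods, for illustration only (not study results).
p_interaction = lrt_p_value_1df(loglik_without=-500.0, loglik_with=-498.0)
print(f"P for interaction = {p_interaction:.4f}")
```

In this made-up case the test statistic is 4.0. Under this sketch, the comparison would be run once for current EPT versus never HT use and once for current ET versus never HT use, as described above.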

Results 

A greater proportion of breast cancer cases than con- 
trols were currently using EPT at baseline; a lower pro- 
portion of cases than controls had high parity. Cases 
also had slightly earlier age at menarche than control 
women (Table 1). 

Evaluation of the q-q plot of the P values for the asso- 
ciation between the 317 SNPs in the 24 hormone meta- 
bolism genes and breast cancer risk showed no evidence 
of systematic bias (Supplementary Figure S1 in Additional
file 2).

We observed statistically significant associations at a
P value of less than 0.01 with two SNPs in SLCO1B1
and one SNP in HSD17B4 in the overall analysis
(Table 2). However, after correction for multiple
comparisons, none of these associations was statistically
significant. For postmenopausal women, 10 SNPs
in SLCO1B1 were associated with breast cancer risk
with an uncorrected P value of less than 0.01. Of
these, the associations for SNPs rs11045773,
rs11045777, rs16923519, rs4149057, and rs11045884
remained statistically significant after correction for
multiple testing within the gene (within-gene P_ACT ≤
0.05). However, after correction for multiple testing
across all genes, none of these associations was
statistically significant. The ORs and 95% CIs
for all tested SNPs are provided in Supplementary
Table S2 in Additional file 3. There was some evidence
that, of these, rs4149013 in SLCO1B1 was associated
with breast cancer risk in postmenopausal women (OR
1.39, 95% CI 1.07 to 1.81; uncorrected P = 0.015).

When examining by HT use, we observed a strong 
association between several SNPs in SLCO1B1 and postmenopausal
breast cancer risk in current EPT users
(Table 3). Breast cancer risk among postmenopausal
women who were using EPT at baseline increased more
than twofold per minor allele of rs4149013 (OR 2.31,
95% CI 1.47 to 3.62; P = 0.0003, within-gene P_ACT =
0.002). This association was statistically significant even
after P_ACT adjustment across all SNPs studied (P_ACT =
0.023). The P value for interaction (EPT versus never
HT use) was 0.019 (not corrected for multiple testing). 
When we combined the homozygous and heterozygous 
minor allele carriers (that is, a dominant genetic model), 
we observed similar OR estimates and P values (OR 
2.43, 95% CI 1.53 to 3.85; P = 0.0002) (Supplementary 
Table S3 in Additional file 4). We did not observe any 
significant associations in never HT users or ET users.
There was no statistically significant difference in effects 
when stratifying by estrogen receptor status. 
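
As a rough consistency check, the per-allele OR, its 95% CI, and the uncorrected P value for rs4149013 in current EPT users can be related through a Wald approximation. The sketch assumes that the reported interval is a Wald interval on the log-odds scale; the function name is hypothetical, and the result is only approximate because the published values are rounded.

```python
import math


def wald_from_or_ci(or_est: float, ci_low: float, ci_high: float,
                    z_crit: float = 1.959964):
    """Recover log-OR, standard error, z, and two-sided P from an OR and its 95% CI."""
    beta = math.log(or_est)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return beta, se, z, p_two_sided


# Log-additive model: OR 2.31, 95% CI 1.47 to 3.62 (reported P = 0.0003).
_, _, z_additive, p_additive = wald_from_or_ci(2.31, 1.47, 3.62)

# Dominant model: OR 2.43, 95% CI 1.53 to 3.85 (reported P = 0.0002).
_, _, z_dominant, p_dominant = wald_from_or_ci(2.43, 1.53, 3.85)

print(f"additive: z = {z_additive:.2f}, P = {p_additive:.4f}")
print(f"dominant: z = {z_dominant:.2f}, P = {p_dominant:.4f}")
```

Under this approximation, both P values round to the reported figures, which suggests that the estimates, intervals, and P values above are internally consistent.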

Discussion 

In this case-control study nested within the California
Teachers Study cohort, genetic variation in only 1
(SLCO1B1) of 24 hormone metabolism pathway
genes was associated with breast cancer risk.
SLCO1B1, a gene involved in the hepatic uptake of
female sex steroids, seemed to be associated with breast
cancer risk among postmenopausal women. This association
was statistically significant after correcting for
multiple testing within the gene but was not statistically
significant after we corrected for multiple testing across
genes. However, there was also an indication that EPT
may interact with SNPs in SLCO1B1; one variant in
SLCO1B1 (rs4149013) was statistically significantly associated
with breast cancer risk in EPT users.

Our findings of no association between SNPs in hor- 
mone metabolism pathway genes and breast cancer risk 
are consistent with the results from other large studies 
such as BPC3 [13,29,30,41] and meta-analyses of 
selected functional SNPs in CYP1A1 [23], SULT1A1 
[42,43], CYP1B1 [44,45], and COMT [46], although two 
smaller meta-analyses of selected functional SNPs in
COMT supported an association in Caucasian popula- 
tions [47,48]. Although a few studies have suggested 
associations between genetic polymorphisms in other 
Table 1 Characteristics of the study participants at time of joining the cohort

                                                                 Control                    Case
Characteristics                                               Number   Percentage      Number   Percentage
Number                                                         1,395                    1,351
Mean age ± SD, years                                      56.1 ± 9.5               55.0 ± 9.4
Menopausal status
  Premenopausal                                                  347         25.3         364         27.5
  Postmenopausal (a)                                           1,024         74.7         962         72.5
  Unknown                                                         24                       25
HT use among postmenopausal women
  Never used HT                                                  153         15.8         112         12.4
  Current ET use                                                 277         28.7         230         25.5
  Current EPT use                                                401         41.5         468         51.9
  Former ET or EPT use                                           122         12.6          80  [illegible]
  Ever used progestin alone                                       13          1.3          12  [illegible]
  Unknown                                                         58                       60
Parity/total number of FTPs
  Nulligravid                                                    221  [illegible]         219  [illegible]
  Gravid, nulliparous                                             61          4.4          66          4.9
  1 FTP                                                          159         11.6         174         13.0
  2 FTPs                                                         487         35.4         472         35.3
  3 FTPs                                                         274         20.0         277         20.7
  4+ FTPs                                                        172         12.5         128          9.6
  Unknown                                                         21                       15
Body mass index
  < 25 kg/m²                                                     765         54.8         778         58.9
  25 to < 30 kg/m²                                               387         27.7         384         29.1
  30+ kg/m²                                                      195         14.0         159         12.0
  Unknown                                                [illegible]                       30
Age at menarche
  ≤10 years                                                      100          7.3         100          7.5
  11-12 years                                                    600         43.6         599         44.8
  13-14 years                                                    565         41.0         542         40.6
  15-16 years                                                    102          7.4          82          6.1
  17+ years                                                       10          0.7          13          1.0
  Unknown/Never had menarche                             [illegible]                       15
Family history of breast cancer (first-degree relative)
  No                                                           1,167  [illegible]       1,085         82.1
  Yes                                                    [illegible]         13.9         236         17.9
  Unknown                                                         39                       30
History of breast biopsy
  No                                                     [illegible]         77.8       1,034         76.5
  Yes                                                    [illegible]         22.2         317         23.5
Screening mammograms within last 2 years
  No                                                             138         10.0         146         10.9
  Yes                                                          1,241         90.0       1,193         89.1
  Unknown                                                [illegible]                       12

(a) Includes perimenopausal women. EPT, estrogen progestin combined therapy; ET, estrogen therapy; FTP, full-term pregnancy; HT, hormone therapy; SD, standard
deviation.
hormone pathway genes, including CYP11A [25,26],
CYP1A1/CYP1A2 [49,50], CYP1B1 [51,52], SULT1E1 
[53], or COMT [54], many of these associations were 
not observed consistently [30,49,50,55-57]. In addition, 
the few studies other than BPC3 that have investigated 
polymorphisms in CYP2C9 [51], CYP3A4 [49,51,56], 
HSD17B2 [58], SRD5A1 [56], and UGT2B7 [51] in relation
to breast cancer risk among Caucasian populations,
have reported no associations. However, a recent study 
using admixture maximum likelihood (AML)-based glo- 
bal tests reported that genetic variation in androgen- 
estrogen conversion pathway was associated with breast 
cancer risk, although no single SNP was significant after 
correcting for multiple testing [59].

To our knowledge, no studies have investigated
genetic variation in SLCO1B1 and breast cancer risk by
hormone therapy use. SLCO1B1, also known as OATP-C
or SLC21A6, is expressed in the liver and plays an
important role in transporting drugs and endogenous
substrates from the blood into the hepatocytes (reviewed
in [60]). Endogenous substrates of SLCO1B1 include
steroid hormone conjugates such as estradiol-17β-glucuronide
and estrone-3-sulfate [61,62]. Serum estrone
sulfate (E1S) is a major form of circulating estrogen in
postmenopausal women and can be converted to estradiol
in breast tissue [63]. E1S is also a major component
of conjugated equine estrogens, the estrogen component
of the predominant (prior to 2002) HT regimens in the
US [64]. Genetic variation in SLCO1B1 has been shown
to decrease the uptake of E1S and estradiol glucuronide
in several [61,65] but not all [66] studies. Furthermore,
one study has shown that genetic variation in SLCO1B1
is associated with blood E1S levels in Caucasians [61],
suggesting that genetic variation in SLCO1B1 may interact
with HT use. rs4149013 is located near the 5' end
of SLCO1B1. The functional significance of this variant
is not known, but even if there is none, this variant
could be linked to a causal allele.

In the publicly available Cancer Genetic Markers of
Susceptibility (CGEMS) breast cancer data [67], we
found additional support implicating SLCO1B1. In
CGEMS, five genotyped SNPs in SLCO1B1 (rs704166,
rs852550, rs852549, rs7489119, and rs2306283) were
associated with breast cancer risk with a P value of less
than 0.05. These 5 SNPs, as imputed genotypes, were
null in our dataset (data not shown), but this could be
due to misclassification from imputation or false-positive
associations across both CGEMS and our data.
However, our findings and those of CGEMS, combined
with the previous literature on the role of this gene in
affecting estrone absorption, suggest that further investigation
of the role of SLCO1B1 genetic variation and its
interaction with EPT on breast cancer risk is warranted.



The strengths of this study include the systematic 
investigation of a large number of hormone metabolism 
genes and the detailed information on HT use collected 
at baseline. A limitation of our study was the inability 
to genotype all tagging SNPs for several of the genes of 
interest, including AKR1C4, ARSC, and CYP19A1. Thus,
we cannot exclude the possibility that the lack of asso- 
ciations for these loci was due to incomplete tagging. 
Overall, we had 80% statistical power to detect ORs 
ranging from 1.17 to 1.40 for SNPs with an MAF of 
0.05 to 0.49 by using log-additive models and an alpha 
of 0.05. For the subset analyses among postmenopausal 
women, the minimum detectable OR ranged from 1.20 
to 1.46 for SNPs with an MAF of at least 0.05. The sta- 
tistical power to detect associations in premenopausal 
women or to detect interactions with menopausal status 
or HT use was limited. Another limitation of this study 
is that HT use status assessed at baseline may have 
changed during follow-up. The participation rates 
(donating biological specimens for this nested case-con- 
trol study) among the potentially eligible cohort mem- 
bers were moderate (74% for cases and 61% for 
controls). However, it is unlikely that the participation 
was differential according to genotype and case status, 
and thus selection bias is unlikely to have influenced 
our findings. 

Conclusions 

Common genetic variations in SLCO1B1 may be associated
with breast cancer risk in postmenopausal
women, particularly in EPT users. The known effects of
variants in SLCO1B1 on estrogen metabolism suggest
that further study of the role of SLCO1B1 is warranted.